Significant Lexical Relationships
Abstract
Statistical NLP inevitably deals with a large number of rare events. As a consequence, NLP data often violates the assumptions implicit in traditional statistical procedures such as significance testing. We describe a significance test, an exact conditional test, that is appropriate for NLP data and can be performed using freely available software. We apply this test to the study of lexical relationships and demonstrate that the results obtained using this test are both theoretically more reliable and different from the results obtained using previously applied tests.
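As an illustration of what an exact conditional test looks like for a word pair, the sketch below computes a one-sided p-value from a 2x2 contingency table of bigram counts. It assumes the test in question is Fisher's exact test, which the abstract does not name; the function name, the table layout, and the counts in the usage example are hypothetical and are not taken from the paper.

```python
from math import comb


def fisher_exact_right_tail(n11, n12, n21, n22):
    """One-sided (right-tail) Fisher's exact test for a 2x2 table.

    Assumed table layout for a bigram "w1 w2" (illustrative only):
        n11 = count(w1 followed by w2)
        n12 = count(w1 followed by something else)
        n21 = count(something else followed by w2)
        n22 = count(neither)
    Returns the probability, with all row and column totals held fixed,
    of seeing n11 or more co-occurrences if the two words are independent.
    """
    row1 = n11 + n12            # total occurrences of w1 in first position
    col1 = n11 + n21            # total occurrences of w2 in second position
    total = n11 + n12 + n21 + n22

    # Number of ways to place col1 "w2" events among all bigrams.
    denom = comb(total, col1)

    # Sum the hypergeometric probabilities of every table at least as
    # extreme as the observed one (same margins, larger or equal n11).
    numer = 0
    for k in range(n11, min(row1, col1) + 1):
        numer += comb(row1, k) * comb(total - row1, col1 - k)
    return numer / denom


if __name__ == "__main__":
    # Hypothetical counts for a rare bigram in a small corpus.
    p_value = fisher_exact_right_tail(3, 7, 12, 9978)
    print(f"right-tail p-value: {p_value:.3g}")
```

Because the p-value is computed by enumerating tables rather than by a large-sample approximation, it does not depend on the cell counts being large, which is the property the abstract appeals to for rare events.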
Similar resources
Comparing Lexical Relationships Observed within Japanese Collocation Data and Japanese Word Association Norms
While large-scale corpora and various corpus query tools have long been recognized as essential language resources, the value of word association norms as language resources has been largely overlooked. This paper conducts some initial comparisons of the lexical relationships observed within Japanese collocation data extracted from a large corpus using the Japanese language version of the Sketc...
Automatic generation of probabilistic relationships for improving schema matching
Schema matching is the problem of finding relationships among concepts across data sources that are heterogeneous in format and in structure. Starting from the "hidden meaning" associated with schema labels (i.e. class/attribute names), it is possible to discover lexical relationships among the elements of different schemata. In this work, we propose an automatic method aimed at discovering p...
Identifying Lexical Relationships and Entailments with Distributional Semantics
As the field of Natural Language Processing has developed, research has progressed on ambitious semantic tasks like Recognizing Textual Entailment (RTE). Systems that approach these tasks may perform sophisticated inference between sentences, but often depend heavily on lexical resources like WordNet to provide critical information about relationships and entailments between lexical items. Howe...
Dealing with Uncertainty in Lexical Annotation
We present ALA, a tool for the automatic lexical annotation (i.e. annotation w.r.t. a thesaurus/lexical resource) of structured and semi-structured data sources and the discovery of probabilistic lexical relationships in a data integration environment. ALA performs automatic lexical annotation through the use of probabilistic annotations, i.e. an annotation is associated to a probability value....
Lexical acquisition and clustering of word senses to conceptual lexicon construction
We describe a mechanism and an algorithm to support construction of a large complex conceptual lexicon from an existing alphabetical lexicon. As part of this research, we define lexical models to present words and lexicons. Given the fact that an alphabetical lexicon contains lexical information about words which are organized by their spelling, constructing a conceptual lexicon requires an ide...